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Abstract 


This paper presents a novel method of identifying a state space model and a state es- 
timator for linear stochastic systems from input and output data. The method is primarily 
based on the relations between the state space model and the finite difference model for 
linear stochastic systems derived through projection filters. This paper proves that least- 
squares identification of a finite difference model converges to the model derived from the 
projection filters. System pulse response samples are computed from the coefficients of the 
finite difference model. In estimating the corresponding state estimator gain, a z-domain 
method is used. First the deterministic component of the output is subtracted out, and 
then the state estimator gain is obtained by whitening the remaining signal. Experimental 
example is used to illustrate the feasibility of the method. 


Research Associate, Mars Mission Research Center, Dept, of Mechanical and Aerospace Engineering, 
member AIAA. 

Associate Professor, Dept, of Mechanical Engineering and Mechanics, member AIAA. 
f Principal Scientist, Spacecraft Dynamics Branch, Fellow AIAA. 


1 



Introduction 


System identification, sometimes also called system modelling, deals with the prob- 
lem of building mathematical models of dynamical systems of interest based on their in- 
put/output data. This technique is important in many disciplines such as economics, com- 
munication, system dynamics and control 1,2 . The mathematical model allows researchers 
to understand more about the properties of the systems, so that they can explain, predict 
or control the behaviors of the systems. 

In automatic control of dynamical systems, in order to determine appropriate control 
force the controller design requires mathematical models of the systems. The quality of 
the model, therefore, will greatly affect the performance of the controller. Though a great 
variety of system identification methods have been proposed during the last few decades, 
still identification of systems of large dimension, or systems with noises in both input and 
output remains a difficult task. 1,2,3 

Because modem control theories are mostly developed based on state space description 
of systems, the model in state space format is preferred for control purpose. However, 
because the relation between input/output data and parameters in state space model is 
non-linear, if system identification chooses a state space model directly, the parameter 
estimation of the model becomes a non-linear optimization problem, which is difficult to 
solve in general. Usually iterative numerical methods should be resorted, but convergence 
and uniqueness of the solution are not guaranteed. On the other hand, if some special 
difference model which has linear relation between input/output data and parameters 
is chosen, the parameter estimation is a linear optimization problem, which has unique 
solution and can be solved analytically. The least-squares methods provide simple and 
powerful tools for solving linear optimization problems, either recursively or in batch. 
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Therefore, in general, difference models are easier to identify than state space models. 
However, for the demand of state space models in control applications, efforts have been 
made to convert a difference model to a state space model. 4 

Recently, a method is introduced in Refs. 5 and 8 to identify a state space model 
from a finite difference model, called the AutoRegressive with eXogeneous input (ARX) 
model, which is derived through Kalman filter theories. However, the requirement of large 
order causes intensive computation in the embedded least-squares operation. In Ref. C 
a method is derived to obtain a state space model from input/output data using the 
notion of state observers. This approach can use an ARX model with the order much 
smaller than that derived through the Kalman filter, but the derivation is based on the 
deterministic approach. In Ref. 7, it has been proved that as the order of the ARX model 
increase to infinity, the observer identification converges to the Kalman filter identification. 
However, for a stochastic system and a small order ARX model, to what the least-squares 
identification of the ARX model will converge in a stochastic sense is not clear. 

This paper addresses the above mentioned problem using the stochastic approach. 
The approach is primarily based on the relationship between state space models and finite 
difference models linked through the projection filter. 8 First, an ARX model is chosen, and 
then the ordinary least-squares is used to estimate the coefficient matrices. Based on 
the relationship between the projection filter and state space parameters the system pulse 
response samples, i.e., the Markov parameters, can be calculated from the coefficients of the 
identified ARX model. To decompose the Markov parameters into state space parameters, 
the Eigensystem Realization Algorithm (ERA) 9 is employed. ERA is effective in realizing 
state space model from system pulse responses, 10-14 which are the Markov parameters for 
discrete systems. 

To compute the state estimator gain a different method is developed in this paper 
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using a z-domain approach in contrast to the time-domain approaches used in Refs. 5 and 
7. After identifying a state space model, the deterministic part of the output is subtracted 
out. The remaining signed represents the stochastic part and can be modeled by a Moving 
Average (MA) model of which the coefficients are in terms of state space parameters 
and the Kalman filter gain. Identifying the MA model is approximated by identifying a 
corresponding AotoRegressive (AR) model first and then inverting it. From the identified 
MA model a state estimator gain can be calculated. Finally, the identification of a ten-bay 
structure is used to illustrate the feasibility of the approach. 

Relations between Projection Filter and 
Finite Difference Model of a Linear System 

The projection filter is a linear transformation matrix which projects (transforms) a 
finite number of input/output data of a system into its current state space. The image of 
the projection is an optimal estimate of the current state, and the filter is chosen such that 
the mean square estimation error is minimized. 8 Here ’’filter” is a generic term referring to 
a data processing procedure which extracts desired information from data. To explain the 
relation between the projection filter and the finite difference model of a linear system, we 
start from a simple case and gradually move to more general ones. 

Consider a finite- dimensional, linear, discrete-time, time-invariant, noise-free dynamic 
system, which can be represented by a state space model as 


Ax k T Bxt k 

(1) 

1 Ik = Cx k + Du k , 

(2) 


where x is an n x 1 state vector, u an m x 1 input vector, and y a p x 1 measurement or 
output vector. Matrices A, B> C and D are respectively the state matrix, input matrix, 
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output matrix and direct influence matrix. The integer k is the sample indicator. 


From Eqs. (1) and (2) it is easy to follow that 


Vk ■ 


■ C - 

Vk-i 

— 

C A~ l 

- Vk—q+l - 


. CA "» +1 . 


' -D 

0 

0 


‘ Uk “ 

0 

CA~ l B-D ■ 

0 


WJt-1 

. 0 

CA-t+'B •• 

• CA~ l B-D . 


- u k — q + 1 - 


or in short 

Yq t k — HqXk G qU k 

or in normal form 

HqX k = Yq t k + GqUk, 


( 3 ) 

( 4 ) 

( 5 ) 


where q denotes the number of data stacked up to form the equation, and the meanings 
of other notations are self-evident. If the state vector x* is the variable to be solved, 
Eq. (5) contains n unknowns and p x q equations. However, there are only at most n 
independent equations. Therefore, for a sufficiently large q (p x q > n) which can make 
H q full-column-ranked, the unique solution of x* is 


x k = F q (Y (I)k + G q U k ) } ( 6 ) 

where 

= (7) 

is the pseudo-inverse of H q and also the projection filter in this case. If p x q = n (i.e., H q 
is square), F q becomes H~ x . In general the number q can be any integer bigger than an 
integer q m i n which is the minimum number required to make H q full-column- ranked. The 
solution fjt is identical with the true value Xk- 
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To write a difference model of the system which expresses the current output as a 
linear transformation of finite previous input/output data, one can use Eqs. (1), (2) and 
( 6 ) 


Vk — Cxk + Duk 

= CAxk-x + CBu k -i + Duk 
= CA [F 9 (r, |fc _i + G q Uk-i)] + CBuk-i + Duk 

« 7 

= J2 CAF v Vk-> + Du k + (CB - CAF ql D)u k -i + CAF q G qi u k -i 

1 = 1 i=2 

7 7 

= + J2 DiUk ~" ( 8 ) 

« = 1 1=0 


where Fg X (G R p ) and Gq X (G F pX(iXm ^ are the z-th partitions of Fq and respectively, 
defined as 


F « = 


F q 1 : F q 2'. 


:F 


77 


G q = 


Gq\. G q 2'. 


:G 


77 


( 9 ) 


and A, = CAF qi , (* = 1 , B 0 = D, B x = CB-CAF ql D, B { = CAF q G qi , (i = 

2, • • • , q)- The model described by Eq. (8) is an ARX model. 


Next, consider a system without process noise but with additive, white, gaussian, and 
zero-mean measurement noise which is not correlated with the state variable, the output 
equation becomes 

yk = Cx k + Du k + v k , 

where Vk represents the measurement noise. Similarly, we can derive a matrix equation 


H q Xk — Y qi k + G q Uk — Vq t k, 

where V£ k = [vj, The unknown variable x k is still a deterministic 

variable in this case. By the theory of parameter estimation for deterministic parameters 
from a linear equation with independent white noise, 16 one can write the optimal estimate 
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of Xk as shown in Eq. (6) with 


F q = (HjR- l H q )~ l H^R- 1 (10) 

which is a weighted pseudo-inverse of H q and 7? = R ® I q , ® is the Kronecker product, R 
the covariance of the measurement noise, and I q the identity matrix of dimension q. The 
optimality is defined by the minimum variance of state estimation error. 

To derive an ARX model using the relation provided by the projection filter is similar 
to the previous case. We can form a one-step-ahead output prediction using the last 
estimated state as 

Hk = CAik-\ + CDu jt_i + Duk (11) 

and define 

Vk = Vk + r)k (12) 

where rfr k is the prediction error. Therefore, 

Vk — CAx jt — i + CBu k -\ + Du k + rjk 

= CA [F q (Y qi k-i + G q Uk-i)} + CDuk-\ + Duk + rjk 

A 9 

= CAF q iyk-i + Duk + ( CB — CAF q iD)uk~i + CAF q G q jUk-j + rjk 

*=1 i =2 

9 9 

= ^2-^iVk-i + ^ B{Uk-i + r/k, ( 13 ) 

1=1 1=0 

where F q i and G q i are defined the same way as in Eq. (9), but F q is defined by Eq. (10) 
in this case. 

Next, consider a more general case for a system with both process and measurement 
noises. In state space format the system can be modeled as 

Xk+i = Ax k + Bu k + w k (14) 

Vk = Cxk + Duk + Vk, (15) 
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where the sequences {uq. } and {tq} are the process (input) noise and the measurement 
(output) noise, respectively. Both are assumed to be gaussian, zero-mean and white with 
covariance matrices Q and 72, respectively. They are also assumed statistically independent 
of each other. 

Similarly, by writing the previous output in terms of the current state using Eqs. (14) 
and (15), one can derive 

Y q<k = H q x k - G q U k - M q W q> k + V q>k , (1C) 

where 


Mq = 

0 

C.4" 1 • 

0 ' 
0 

. W q , k = 

W k -i 


.CA-1+ 1 • 

• • C.4 -1 . 


. w k-q + l . 


Equation (16) can be further simplified to be 

H q x k = Y^ *+£,,* (17) 

where 

Kk = + G “ U £».* = M q W qtk - v q , k . 

Note that the unknown variable x k is a random variable in this case. The overall noise 
vector £ q>k is still gaussian and zero-mean because W qyk and V qyk are gaussian and zero- 
mean. It is also correlated with the unknown variable x k because W q<k is correlated with 
x k . Denote the covariance between x k and £ q k by P Z £. For a linear equation like Eq. (17), 
suppose the mean of the current state x k and its variance P z is given, by the theory of 
random parameters estimation 8 ’ 16 the optimal estimate of a:* can be obtained by 

Xk =x k + F q (Y q k - Y' ik ), (18) 

where the overbar denotes the expectation value, 

Y q,k = HqX k , 
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and 

P<i = {P*Hj + P x t)(H q P x Il7 + H q P ti + P^H? + J^)- 1 (19) 

is the projection filter in this case. Matrix R £ denotes the covariance of The optimality 

is under the minimum variance of state estimation error. 

Similarly, to derive an ARX model, we can use one-step-ahead output prediction as 
Eq. (11) and have 

yr- = CAZk-i + CDu^-i + Du k 

= CA [**_! + F q (V q \ k _ 1 - ?;,*_!)] + CDu k _ x + Du k 
= CAF q Y q> *_! + CBu k -. x + CAF q G q U k - x + CA(I n - F q H q )x k - x + Du k 
= CAF q iyk-i + Duk + ( CB — C AF q iD)uk-\ 

1=1 

9 

+ ^ + CAXiijt-! , (20) 

1=2 

where 

L = I n — F q H q , 

Fqi and G q i are again defined the same way as in Eq. (9) but F q is defined by Eq. (19) 
instead. 

Equation (20) represents the best prediction of y* one can make using q previous 
input/output data. If the prediction is made once and for all, namely, no prediction 
of previous state is made, the best value assigned to is zero. However, if previous 
state estimation has been carried out, the best choice for x k is the a priori Kalman filter 
estimation. Note that for the Kalman filter 

i k — l ~ — 2 "f" AI\ ( J/fc _2 — CXf ,^ 2 Dllk- 2) + BUfc—2 

9 ~ 1 9-1 

= J2A'- 1 AKy k . 1 . l + ^2A l -\D-AKD)u k . 1 . i + AH k - <n (21) 

i=i 1=1 
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where 


A = A(I n - KC ), 

A is the optimal steady state Kalman filter gain. Based on the argument above, we 
can replace x k - x in Eq. (20) by Eq. (21) and obtain 

Vk = Vk + ilk 

<1 

= CAF ql y k - x + ]T CA (F qt + LA'~ 2 AK) y fc _, + Du k + (CB - CAF ql D)u k _ x 
1=2 
9 

+ Y2 CA ( FgGgi + LA'~ 2 (B - AKD )) u k -i + rj k 

i — 2 

9 9 

= Y2 AiVk-i + B,u k -., + ?4_, (22) 

1 = 1 i=U 

where q' k = rj k + CA q x k - q . Note that if q is not large, {^} is not white. Equations (8), 
(13) and (22) represent the AutoRegressive with eXogeneous input (ARX) models of linear 
systems in various different noise situations. The equation in each case provides a best 
prediction of the output measurement at time k in the sense of minimum state error at 
time k — 1 using q previous input and output data. 

Least-Squares Identification of ARX Model 


A general ARX model of a linear system can be written as 

9l ?2 

Vk = Yl AiVk-i + B * U k — i + CJb, (23) 

i=l i=0 

where (yl,y2) is the order of the model. Given a set of input and output data 
VQi ^o} of the system, we can use the least-squares method to find a 

set of matrix coefficients {Ai, • ■ • , A 9l , Bq, • • • , B q 2 } which fits the equation and the data 
“best” under least-squares error of output prediction sense. The least-squares method 
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for single-input single-output ARX model (a scalar equation) can be found in many text 
books. f The extension to multi-input multi-output model is straightforwards.^ 

The ARX models derived in the last section have an order (q y <y), or just q in short, 
which are special cases of the general ARX model. It can be proved (see appendix) that 
if we choose an ARX model of order q and use the least-squares method to identify the 

parameters of the model, the parameters will converge to that derived from the projection 
filter. 


Obtaining System Markov Parameters from ARX Model 

There are some special relations between the system pulse response samples, i.e., the 
system Markov parameters, and the coefficient matrices of the ARX models derived in the 
previous section. Based on these relations we can obtain the system Markov parameters 
from the ARX models. 

For noise-fiee systems, from Eq. (8) if we denote the coefficient matrices of yk—j and 
u k-j by Aj and Dj , respectively, we can have 

j 

CA>B = Bj+i + £ A t CA^ { B + A j+1 D. (24) 

1-1 

Note that this equation can calculate the system Markov parameters CA j B (;' = 1, • . • , q- 
1) iteratively from the coefficient matrices of a ARX model of order q (note B 0 = D and 
CB = B\ +A l D). 


Proof: 


By definition 


G q i = [— D t , 0 j 


■>TiT 
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and for j > 2 


0 


' C ‘ 


CN 

1 

o 

■ 


‘ 0 ' 

CA~ l B - D 

= 

CA~i+ 3 

CA-1+ 1 

A’~ 2 B - 

c 

0 

B - 

0 

D 






0 



CA-1+ 1 _ 








0 









. 0 . 


= - E h ~ 2 ‘B - D lj) 


(25) 


where 


EV-*=[(CAi-y, C T , 0 T , ■••O 7 '] 7 ', 

=[ 0 7 ',--., 0 T , £> T , oV-, 0 7 '] 7 ', 

has D in the j-th block and is zero elsewhere. Therefore, with j > 1 

CAF q G q (i+1) = CAF q H q A J ~ 1 B - CAF q E^B - CAF q(j+1) D 

j 

= CAF q H q A J ~ 1 B - CAF qx CA>~'B - CAF q(j+1) D (26) 

»-i 

It is noted that Eq. (26) also holds for the system with both process and measurement 
noises. Now, because 

F 't H q = (27) 

and from Eqs. (8) and (26), we have 

, ^ 

Bj+i = CAF q G qU+l) = CA>B - ^^CA^B- A j+1 D. 

i=i 

So, Eq. (24) follows. 


Q.E.D. 

We can also iteratively calculate CA>B ( j = q, q + 1, ...) by 

CA’B = CA(F q H q )A j ~ 1 B 

= J2 A ‘ CAj ~'B. 

i - 1 


( 28 ) 



Though derived from noise-free systems, the above equations (Eqs. (24) to (28)) also 
hold for systems with additive white measurement noise. Because for systems with white 
measurement noise the projection filter F q is nothing but a weighted pseudo-inverse of H q , 
hence Eq. (27) also holds. 

It is interesting to see that Eq. (24) also holds for systems with both process and 
measurement noise even though Eq. (27) does not hold in this case. This can be proved 
as follows. 

Proof: 

Be definition (see Eq. (22)) 

3 

Bj + 1 T 'y "] AjC A J ‘B -\- Ajj r \D 

1=1 

— CA(F q G q (j+i) + LA 3 1 (B — Ah D )) + CAF q xCA 3 ~ x B + CA{F q z + LAI()CA 3 ~ 2 B 
+ • • ' + CA(F qj + LA 3 ~ 2 AK)CB + CA(F qU+1) + LA 3 ~ X AK)D 

J 

= CAF q G q(j+l) + CALA j ~\B - AKD) + 

i -- 1 

3 - 1 

+ CALA'-'AKCAi-'-'B + CA(F qU+1) + LA 3 ~ X AK)D 
1 = 1 

i 

= CAF q H q A 3 ~ x B - CAF qi CA 3 ~'B - CAF qU+l) D + CALAIS 

1=1 

j J “ 1 

- C ALA 3-1 AKD + ^>2 CAF qt CA 3 ~‘B + CALA^AKCA^^B 

1 = 1 1 = 1 

+ CAF qU+1) D + CAL A 3 ' 1 AKD 

= CAF q H q A 3 ~ x B + CALA j ~ x B + CALA j ~ 2 AKCB 

J-2 

+ ^ CALA i - x AKCA j ~ i - x B 

1=1 

3 ~ 2 

= CAF q H q A 3 ~ x B + CALA 3 ~ 2 (A + AKC)B + y^CALA i - 1 AKCA j - i - x B 

1=1 
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J-2 

= CAF q H q A 3 ~ l B + CALA 3 ~ 2 AB + C ALA^ 1 ARC A 3 ~'~ l B 

i ~ : 1 

= CAF q H q A 3 ~ l B + CALAA 3 ~ 2 B + CALAKCA 3 ~ 2 B 
= CAF q H q A 3 ~ x B + CALA 3 ~ l B 
= CA{F q H q +L)A 3 ~ l B 
= CA 3 B 


(29) 


where the relations A + ARC = A, F q H q + L = /„ and Eq. (26) are used. 

Q.E.D. 

However, Eq. (28) does not hold for systems with process noise. Hence, for an ARX 
model of order q, only q terms of the system Markov parameters can be obtained. A 
system of order n has only n independent Markov parameters; all the rest are the linear 
combination of these n independent ones. Therefore, the order chosen for the ARX model 
q should be greater than or equal to n. In general, more Markov parameters can improve 
the accuracy of the identified state space model in the later procedure; however, a trade-off 
is that a larger q will increase computational load. 

To decompose the identified Markov parameters into state space parameters [A, Z?, C], 
one can use the Eigensystem Realization Algorithm (ERA). ERA is a simple and accurate 
algorithm for identification of linear systems from pulse response samples. It has been 
proved valuable for modal state parameter identification from test data. 13,14 The algorithm 
uses the pulse response samples (i.e., the Markov parameters for a discrete-time system) 
to form a large block data matrix which is referred to as the general Hankel matrix. Then 
the technique of singular value decomposition is used to decompose the Hankel matrix. 
The system order is determined by counting the number of singular values retained. The 
small singular values are attributed to noises and are truncated. The state space model 
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can be computed from the decomposed matrices. The realized model is not unique, but 
the Markov parameters are unique. For further details, the readers are referred to Ref. 9. 


Identification of a State Estimator Gain 


After obtaining a set of state space parameters via the ERA, a corresponding state 
estimator gain can be estimated. This method is enlightened by the Kalman filter theory. 15 
From the Kalman filter formulations, a filter model in innovation form is 19 

£ T+i = A H + Bu k + AK k e k (30) 

y k = Cx~ + Du k +e* ( 31 ) 

where x ^ is the optimal prediction made by the Kalman filter based on all the data prior 
to the moment k, and K k the Kalman filter gain; the quantity e k , called residual, is the 
difference between true output y k and predicted output y k (= Cx ^). In steady state the 
filter gain is constant and the subscript of it can be omitted. Equations (30) and (31) are 
called “innovation model”, because the quantity e k is also called “innovation” because in 
a sense it contains new information which can not be obtained from previous data. For an 
optimal Kalman filter, the sequence {£jt} is white , 19 which is a useful property. 

From the innovation model, the Kalman filter can be viewed as driven by deterministic 
input u k through B and by stochastic input £ k through AK. Hence, the filter state and 
output can be decomposed into two parts — one caused by the deterministic input and the 
other caused by the stochastic input. Accordingly, the innovation model can be divided 
into two models: 


*+i,i ~ A Bu k 

(32) 

Uk, i = Cx^ A + Du k 

(33) 

15 




and 


(34) 

(35) 


*k + 1,2 ~ ^k,2 + AKkek 
Uk,2 = Cx i ti2 + £* 

where x k = rr^j + x A . 2 end y* = t/* fl + y k2 . Expanding Eqs. (33) and (35) based on 
Eqs. (32) and (34), respectively, one can derive 

k 

Vk,\ = Du k + 'y' j CA t ~ l Buk-j, (3C) 

i-i 

k 

Vk,2 = e k + y ^CA'Ke k _i. (37) 

i - 1 

Combining the above two equations, one obtains 

* k 

y k = Du k + CA'- l Bu k _i + e k + ^ CA'Ke k ^. (38) 

<-l 1=1 

Equation (38) clearly shows the two parts of which the output is composed. If the accu- 
rate state space parameters [A, B, C , D) are known, one can subtract the deterministic 
component y kA out from the output y k ; that is, by defining s k =y k>2 = y k - y k u then 

k k 

*k = £k + ^ CA'Ke k -i = ^ Ci£k-i , (39) 

«=1 i =0 

where Co = I p , C{ — CA'K. The remaining signal s k represents the stochastic component 
and is driven by the sequence {t fc }. For a stable system all the terms CA'K , i > q , are 
negligibly small when q is sufficiently large; therefore, when k is large the upper limit 
of the summation on the right side of Eq. (39) can be replaced by q. Equation (39) 
describes s k as a linear transformation of a white sequence {e fc }; therefore, it is called a 
Moving Average (MA) model . 18,19 The matrices C\, • • • ,C q are constants, called the MA 
parameters. The term moving average arose because s k can be regarded as a weighted 
average of £*, - • • , £ k -q. Note that the MA parameters are expressed in terms of the state 
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space parameters A, C and steady state Kalman filter gain K. If the MA parameters are 
known one can compute the filter gain from thorn. 


The problem of estimating the MA model in Eq. (39) is that the white sequence {£*} is 
not readily available; therefore, ordinary least-squares methods frequently used estimating 
the coefficients of linear equations can not be used directly. However, we can estimate 
a corresponding autoregressive (AR) model first, and then invert it to approximate the 
original MA model. To highlight this point, taking z-transform of both sides of Eq. (39) 
to become 

9 

S=]Tc i z- i E = M(z- l )E, (40) 

i~0 

where M(^z ) is a polynomial matrix in z * (a matrix whose entries are polynomials in 
z -1 ). Matrix M(z~ l ) can be regarded as a filter which receives e k and its delayed versions 
as the input and yields $ k as the output. If we can find the inverse filter N(z~ l ) of M(z ~ l ) 
such that N(z~ 1 )M(z~ 1 ) = I p , by pre-mult iplying Eq. (40) with we have 


N(z~ 1 )S = E. 


(41) 


Matrix N(z *) in general is an infinite-ordered polynomial matrix in In Eq. (41), 

N{z a ) can be viewed as a whitening filter which receives s* and its delayed versions as 
the input and yields white sequence {ek} as the output. 


To obtain a whitening filter for the signal s^, we can write a AutoRegressive model of 
s k with order r in time domain as 


^Nisis-i = £ k , 


(42) 


i=U 


where N 0 = J p , and estimate the AR parameters N i} • • • , N r . 18 Comparing Eq. (41) with 
Eq. (42) it can be seen that the infinite-ordered polynomial matrix TV^z” 1 ) is approximated 
by a finite-ordered polynomial matrix y^ r _ Q N t z~ l . The parameter estimation of the AR 
model can be accomplished using the ordinary least-squares method. 


17 



After obtaining an identified N(z ! ), we can invert it to approximate In- 

verting a square polynomial matrix is similar to inverting an ordinary square matrix (a 
matrix with scalar entries), and the result is the adjoint matrix divided by the determinant 
of the matrix. In the operation, multiplying two polynomials is equivalent to convoluting 
the coefficient sequences of the two polynomial, dividing two polynomials is equivalent 
to deconvoluting the coefficient sequence of the numerator polynomial over that of the 
denominator, expanding to the number of terms desired. 


After obtaining the estimated MA model and collecting ql coefficients, one can form 
a matrix 


M = 


and the least-squares solution of K is 


' CAK ■ 
Ca 2 K 

.CAAK. 


( 43 ) 


K = (JJ T II)- 1 II T M = H'M 


( 44 ) 


where 


H={(CAf, (CAY, --ACA’Yf (45) 

is an observability-like matrix, which is full-column-ranked for an observable system; 
is the pseudo-inverse of H . 


Because of the approximation used in the process, K is not a real optimal Kalman 
filter gain; however, it represents an identified state estimator gain (or suboptimal Kalman 
filter gain). The quality of the identified gain relies on the accuracy of the identified state 
space parameters and the order of the whitening filter r. If the identified state space model 

is accurate and the order r is chosen large enough, the identified gain will converge to the 
optimal steady state Kalman filter gain. 


Experimental Example 


IS 



An experimental example is used to demonstrate the feasibility of the integrated 
system identification and state estimation method developed above. A ten-bay structure 
as shown in Fig. 1 is considered. The truss is one of the structures built in NASA Langley 
Research Center for experiments in studies of control and structure interaction (CSI). It 
is 100 inches long, with a square cross section of 10 in x 10 in. All the tubing (longerons, 
battens, and diagonals) and ball joints are made of aluminum. The structure is in a vertical 
configuration attached from the top using an L-shaped fixture to a backstop. Two cold air 
thrusters acting in the same direction are placed at the tip. The thrusters which are used 
for excitation and control have a maximum thrust of 2.2 lb each. A mass of approximately 
20 lb is attached at the beam tip to lower the fundamental frequency of the truss. Two 

servo accelerometers located at a corner of the square cross section provide the in-plan tip 
acceleration. 

The structure was excited using random inputs to both thrusters for 30 seconds. The 
input signals were filtered to concentrate the energy in the low frequency range. A total 
of 7499 data points at sampling rate 250 IIz is taken. The two output acceleration signals 
were filtered using a three-pole Bessel filter with a break frequency of 20 Hz. 

From the output we can tell the dominant mode is about 5 to 6 Hz. To avoid using 
too large order in the least-squares filter the sampling rate is reduced to 1/2 of the original 
one by choosing one out of every two samples. Hence, the sampling rate becomes 125 Hz 
and totals 3750 data. The order of the ARX model is set to 100. Figure 2 shows the 
identified system Markov parameters CAp'B (i = 1, • ■ • , 100). By the ERA three modes 
are identified and the identified modal frequencies and dampings are fisted as follows: 

Frequency (rad/sec) Damping (%) 

37.09SS 0.27 

46.1175 2.87 


Mode 

1 

2 
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3 


304.4S17 


0.40 


The corresponding state space parameters in normalized modal format are 
.A = ding 


0.9555 0.2922 

-0.2922 0.9555 


0.9229 0.35G8 

-0.35C8 0.9229 


-0.7538 0.6423 

-0.6423 -0.7538 



[ 0.1725 

-0.1117 

0.1122 

-0.0321 


L- 

0.17S9 

0.1267 

-0.1522 

0.0556 

C : 


'1.7754 

0.0000 

1.0023 

0.0000 



0.9201 

0.0362 

-1.G909 

-0.3692 




D = 

[ -0.0042 

0.0165' 





! 0.0215 

0.0009 


T T 


0.3309 -0.2725 

1.3946 0.0000 


The identified state space parameters are used to estimate the corresponding Kalman filter 
gam. The identified stochastic Markov parameters CA'K ( i = 1, • • • , 100) are shown in 
Fig. 3, and the estimated state estimator p'ains is 


0.27S7 0.1S07 

0.2277 -0.0CS5 


0.3437 

0.1236 


0.1072 

-0.0884 


0.0065 

-0.0874 


0.0357 

-0.0182 


To show the results of state estimation, the first state of each mode is shown in Fig. 4. 
Since the modal model has been normalized, the amplitude of each modal state indicates 
the energy allocated in that mode. To evaluate the quality of the system identification 
and state estimation, the estimated outputs calculated based on the estimated state are 
compared to the true outputs. Because the true state is not available, output comparison 
is the only way to validate the results. The comparison of the first output is shown in 
Fig. 5, where we can sec the estimated and the true outputs are in good agreement. The 
covariance of the error is less than 1.5 % of the covariance of the output. 


Concluding Remarks 


In contrast to most existing system identification methods of which the great major- 
ity use deterministic approach, the method developed in this paper is derived under the 


20 



stochastic framework, taking into account the effects of process noise as well as measure- 
ment noise. The use of projection filter to derive a state space model provides stochastic 
insight into the model. The accuracy of the identified state estimator gain relies on the 
accuracy of the identified state space model and the order of the whitening filter. The 
order of the whitening filter is not necessary to be equal to the order of the system. The 
larger the filter order is the whiter the residual will be. If the identified model is accurate 
and the order of the whitening filter is sufficiently large, the identified gain converges to the 
optimal steady state Kalman filter gain. An experimental example shows the feasibility of 
the method. 
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Appcizdix 


In the appendix, we want to show that for a linear time-invariant system described by 
a known state space model, the system output predicted by an optimal state (in the least- 
mean-square sense) based on the system model is also optimal (in the least-mean-square 
sense) if the number of the stationary data sample is sufficiently large. In other words, 
though the projection filter is derived based on the criterion of resulting least-mean-square 
state error, it also provides least-squares output error. 

An ARX model of order q of a linear time-invariant system can be written as 

yk = ^2 A iVk-i + Bi U k- > + £k = Vk + £k (Al) 

i=l i=0 

where £* is the output error. Given a set of input/output data {i/*, • • • , y Q , Uk, • • • , u 0 } 
of the system, we can use the least-squares method to find a set of matrix coefficients 
{Ai, • • • , A q , Bo, • • • , Bq} which fits the equation optimally in the least-squares error of 


23 


output prediction sense. It minimizes a scalar cost function J\ , defined by 

k 

Jl = _ yifivi - y.)] = (A 2) 

*=¥ 1 = 7 

However, assume £,• is stationary and k is sufficiently large, 

Ji = (k -q + 1) « (* “ 9 + 1)£[4^], (A3) 

for a stationary random process the sample average can represent the ensemble average 
(expectation) due to its ergodic property. Therefore, minimizing J x is equivalent to mini- 
mizing mean-square output error. 

On the other hand, the projection filter provides a least -mean- square state estimate 
*k - i of from a set of previous input/output data Y T = • • • , y'f, u jT . . . f u Tj 

of the system. Similar to y k , the estimated state x*_ a is also a linear combination of 
previous input/output data and can be shown as (see Eq. (20)) 

XL - 1 = FY 

where F is the projection filter. The optimal estimated state minimizes a scalar cost 
function J 2 , defined by 


(44) 


( 45 ) 


■ J 2 - E[(xk-i - Xk-i) T (x k -i - £ jt_i )] 

= E[(x k . x - FY) T (x k _ x - FT)] 

= trace £[(**_! - FY)( x k -i - FF) T ] 

= trace E[x k - 1 xJ'_ 1 - x k - X Y T F T - FYx J k ’_ x + FYY T F T ] 

with 

dJ'i djo 

diZ 7 = 0 or If = ~ 2E[ x ‘-i y 1 + mE[YY T ]) = 0 

f(£P'Kq)-£[ It .,yq = o 
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The optimal predicted output y k can be derived by using the one-step-ahead prediction 
from the optimal state x k -i as shown in Eq. (11). Prom Eqs. (11) and (15) one can obtain 

= yk ~ Vk — C A(x k _i — FY ) + Cw k —\ -f- v k 

and 

= trace E[e k e J’] 

= trace E[CA{x k _ x - FY){x k -i - FY) T A T C T ] + trace[CQC T + R] 

Because the process noise covariance Q and the measurement noise covariance R are con- 
stant, we have 

dR dr 

dF = dF { traC€ E i CA ( x t-i ~ FY)(x k _ , - FY) t A t C t ]} 

= - 2A T C T CA(E[x k - 1 Y T ]) + 2A t C t CAF(E{YY t \) 

= 2A t C t CA{F{E[YY t ]) - E[x k _ 1 Y T )} 

Prom Eq. (A5), we get d.J x jdF = 0. This proves that the predicted output y k shown 
in Eq. (11) from the projection filter is also the lest-squares output. The coefficients in 
Eq. (22) should be equivalent to those in ARX model (Al) derived from the least squares 
method if the number of data is sufficiently large. 



Figure Captions 


Fig. 1 Ten-bay truss structure test configuration. 

Fig. 2 Identified system Markov parameters CA l ~ l B (the (1,1) element). 
Fig. 3 Identified stochastic Markov parameters CA^K (the (1,1) element). 
Fig. 4 Estimated Modal states. 

Fig. 5 Comparison of true and estimated outputs. 
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